SOURCE Escherichia coli K12.
ORGANISM Escherichia coli K12
Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
Escherichia.

REFERENCE 1 (bases 1 to 11033)
AUTHORS Blattner,F.R., Plunkett,G. III, Bloch,C.A., Perna,N.T., Burland,V.,
Riley,M., Collado-Vides,J., Glasner,J.D., Rode,C.K., Mayhew,G.F.,
Gregor,J., Davis,N.W., Kirkpatrick,H.A., Goeden,M.A., Rose,D.J.,
Mau,B. and Shao,Y.
TITLE The complete genome sequence of Escherichia coli K-12
JOURNAL Science 277 (5331), 1453-1474 (1997)
MEDLINE 97426617
PUBMED 9278503

REFERENCE 2 (bases 1 to 11033)
AUTHORS Blattner,F.R.
TITLE Direct Submission
JOURNAL Submitted (16-JAN-1997) Guy Plunkett III, Laboratory of Genetics,
University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA.
Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax:
608-263-7459

REFERENCE 3 (bases 1 to 11033)
AUTHORS Blattner,F.R.
TITLE Direct Submission
JOURNAL Submitted (02-SEP-1997) Guy Plunkett III, Laboratory of Genetics,
University of Wisconsin, 445 Henry Mall, Madison, WI 53706, USA.
Email: ecoli@genetics.wisc.edu Phone: 608-262-2534 Fax:
608-263-7459

REFERENCE 4 (bases 1 to 11033)
AUTHORS Plunkett,G. III.
TITLE Direct Submission
JOURNAL Submitted (13-OCT-1998) Laboratory of Genetics, University of
Wisconsin, 445 Henry Mall, Madison, WI 53706, USA

COMMENT This sequence was determined by the E. coli Genome Project at the
University of Wisconsin-Madison (Frederick R. Blattner, director).
Supported by NIH grants HG00301 and HG01428 (from the Human Genome
Project and NCHGR). The entire sequence was independently
determined from E. coli K12 strain MG1655. Predicted open reading
frames were determined using GeneMark software, kindly supplied by
Mark Borodovsky, Georgia Institute of Technology, Atlanta, GA,
30332 [e-mail: mark@amber.gatech.edu]. Open reading frames that
have been correlated with genetic loci are being annotated with CG
Site Nos., unique ID nos. for the genes in the E. coli Genetic
Stock Center (CGSC) database at Yale University, kindly supplied by
Mary Berlyn. A public version of the database is accessible
(http://cgsc.biology.yale.edu). Annotation of the genome is an
ongoing task whose goal is to make the genome sequence more useful



1 of 8 



10/26/02 3:24 PM 



NCBI Sequence Viewer 



http://www.ncbi.nlm.nih.go



by correlating it with other data. Comments to the authors are 
appreciated. Updated information will be available at the E. coli 
Genome Project's World Wide Web site
(http://www.genetics.wisc.edu). *** The E. coli K12 sequence and
its annotations are periodically updated; this is version M54. No 
sequence changes. Annotation updates: updated gene identifications 
and products; all new functional assignments courtesy of Monica 
Riley; added promoters, protein binding sites, and repeated 
sequences described in reference 1. The unique numeric identifiers 
beginning with a lowercase 'b' assigned to each gene (protein- or 
RNA-encoding) are now designated as gene synonyms instead of 
labels. This should allow them to be searched for in Entrez as gene 
names.

FEATURES Location/Qualifiers 
source 1..11033
/organism="Escherichia coli K12"
/strain="K12"
/sub_strain="MG1655"
/db_xref="taxon:83333"
protein_bind <1..2
/note="central position to predicted promoter: -44.5"
/bound_moiety="RhaS predicted site"
promoter 7..33
/note="factor Sigma70; promoter sodB; documented +1
at 1733347"
gene 94..675
/gene="sodB"
/note="b1656"
CDS 94..675
/gene="sodB"
/EC_number="1.15.1.1"
/function="enzyme; Detoxification"
/note="o193; 100 pct identical to SODF_ECOLI SW: P09157
but includes initiator met; CG Site No. 15256; alternate
name sodF"
/codon_start=1
/transl_table=11
/product="superoxide dismutase, iron"
/protein_id="AAC74728.1"
/db_xref="GI:1787946"
/translation="MSFELPALPYAKDALAPHISAETIEYHYGKHHQTYVTNLNNLIK
GTAFEGKSLEEIIRSSEGGVFNNAAQVWNHTFYWNCLAPNAGGEPTGKVAEAIAASFG
SFADFKAQFTDAAIKNFGSGWTWLVKNSDGKLAIVSTSNAGTPLTTDATPLLTVDVWE
HAYYIDYRNARPGYLEHFWALVNWEFVAKNLAA"
repeat_region 741..772
/note="REP (repetitive extragenic palindromic) element;
contains 1 REP sequence"
gene complement(837..2006)
/gene="b1657"
CDS complement(837..2006)
/gene="b1657"
/function="putative transport; Not classified"
/note="f389; residues 10-254 are 34 pct identical to
aa8-252 from ARAJ_ECOLI SW: P23910 (394 aa)"
/codon_start=1
/transl_table=11
/product="putative transport protein"
/protein_id="AAC74729.1"
/db_xref="GI:1787947"
/translation="MKINYPLLALAIGAFGIGTTEFSPMGLLPVIARGVDVSIPAAGM
LISAYAVGVMVGAPLMTLLLSHRARRSALIFLMAIFTLGNVLSAIAPDYMTLMLSRIL
TSLNHGAFFGLGSVVAASVVPKHKQASAVATMFMGLTLANIGGVPAATWLGETIGWRM
SFLATAGLGVISMVSLFFSLPKGGAGARPEVKKELAVLMRPQVLSALLTTVLGAGAMF
TLYTYISPVLQSITHATPVFVTAMLVLIGVGFSIGNYLGGKLADRSVNGTLKGFLLLL
MVIMLAIPFLARNEFGAAISMVVWGAATFAVVPPLQMRVMRVASEAPGLSSSVNIGAF
NLGNALGAAAGGAVISAGLGYSFVPVMGAIVAGLALLLVFMSARKQPETVCVANS"
promoter complement(2099..2127)
/note="factor Sigma70; predicted +1 start at 1735400"
promoter 2367..2395
/note="factor Sigma70; promoter purR; documented +1
at 1735713"
protein_bind 2500..2516
/note="central position to purR promoter: 103.5"
/bound_moiety="PurR documented site"
gene 2560..3585
/gene="purR"
/note="b1658"
CDS 2560..3585
/gene="purR"
/function="regulator; Purine ribonucleotide biosynthesis"
/note="o341; 100 pct identical to PURR_ECOLI SW: P15039;
CG Site No. 17989"
/codon_start=1
/transl_table=11
/product="transcriptional repressor for pur regulon, glyA,
glnB, prsA, speA"
/protein_id="AAC74730.1"
/db_xref="GI:1787948"
/translation="MATIKDVAKRANVSTTTVSHVINKTRFVAEETRNAVWAAIKELH
YSPSAVARSLKVNHTKSIGLLATSSEAAYFAEIIEAVEKNCFQKGYTLILGNAWNNLE
KQRAYLSMMAQKRVDGLLVMCSEYPEPLLAMLEEYRHIPMVVMDWGEAKADFTDAVID
NAFEGGYMAGRYLIERGHREIGVIPGPLERNTGAGRLAGFMKAMEEAMIKVPESWIVQ
GDFEPESGYRAMQQILSQPHRPTAVFCGGDIMAMGALCAADEMGLRVPQDVSLIGYDN
VRNARYFTPALTTIHQPKDSLGETAFNMLLDRIVNKREEPQSIEVHPRLIERRSVADG
PFRDYRR"

gene complement(3582..4514)
/gene="ydhB"
/note="b1659"
CDS complement(3582..4514)
/gene="ydhB"
/function="putative regulator; Not classified"
/note="f310; 100 pct identical to YDHB_ECOLI SW: P37598"
/codon_start=1
/transl_table=11
/product="putative transcriptional regulator LYSR-type"
/protein_id="AAC74731.1"
/db_xref="GI:1787949"
/translation="MWSEYSLEVVDAVARNGSFSAAAQELHRVPSAVSYTVRQLEEWL
AVPLFERRHRDVELTAAGAWFLKEGRSVVKKMQITRQQCQQIANGWRGQLAIAVDNIV 
RPERTRQMIVDFYRHFDDVELLVFQEVFNGVWDALSDGRVELAIGATRAIPVGGRYAF 
RDMGMLSWSCVVASHHPLALMDGPFSDDTLRNWPSLVREDTSRTLPKRITWLLDNQKR 
VVVPDWESSATCISAGLCIGMVPTHFAKPWLNEGKWVALELENPFPDSACCLTWQQND 
MSPALTWLLEYLGDSETLNKEWLREPEETPATGD" 

promoter complement(4550..4578)
/note="factor Sigma70; predicted +1 start at 1737851"
promoter 4558..4586

/note="factor Sigma70; predicted +1 start at 1737901" 

gene 4627..5838
/gene="ydhC"
/note="b1660"
CDS 4627..5838
/gene="ydhC"
/function="putative transport; Not classified"
/note="o403; This 403 aa ORF is 90 pct identical (7 gaps)
to 383 residues of a 400 aa protein YDHC_ECOLI SW: P37597"
/codon_start=1
/transl_table=11
/product="putative transport protein"
/protein_id="AAC74732.1"
/db_xref="GI:1787950"
/translation="MQPGKRFLVWLAGLSVLGFLATDMYLPAFAAIQADLQTPASAVS
ASLSLFLAGFAAAQLLWGPLSDRYGRKPVLLIGLTIFALGSLGMLWVENAATLLVLRF
VQAVGVCAAAVIWQALVTDYYPSQKVNRIFAAIMPLVGLSPALAPLLGSWLLVHFSWQ
AIFATLFAITVVLILPIFWLKPTTKARNNSQDGLTFTDLLRSKTYRGNVLIYAACSAS
FFAWLTGSPFILSEMGYSPAVIGLSYVPQTIAFLIGGYGCRAALQKWQGKQLLPWLLV
LFAVSVIATWAAGFISHVSLVEILIPFCVMAIANGAIYPIVVAQALRPFPHATGRAAA
LQNTLQLGLCFLASLVVSWLISISTPLLTTTSVMLSTVMLVALGYMMQRCEEVGCQNH 
GNAEVAHSESH" 
protein_bind 4629..4645
/gene="ydhC"
/note="central position to predicted promoter: 45.5"
/bound_moiety="RhaS predicted site"
promoter 6008..6037
/note="factor Sigma70; predicted +1 start at 1739352"
gene 6129..7277
/gene="cfa"
/note="b1661"
CDS 6129..7277
/gene="cfa"
/EC_number="2.1.1.79"
/function="enzyme; Fatty acid and phosphatidic acid
biosynthesis"
/note="o382; 100 pct identical to CFA_ECOLI SW: P30010; CG
Site No. 10810; alternate name cdfA"
/codon_start=1
/transl_table=11
/product="cyclopropane fatty acyl phospholipid synthase"
/protein_id="AAC74733.1"
/db_xref="GI:1787951"
/translation="MSSSCIEEVSVPDDNWYRIANELLSRAGIAINGSAPADIRVKNP
DFFKRVLQEGSLGLGESYMDGWWECDRLDMFFSKVLRAGLENQLPHHFKDTLRIAGAR 
LFNLQSKKRAWIVGKEHYDLGNDLFSRMLDPFMQYSCAYWKDADNLESAQQAKLKMIC 
EKLQLKPGMRVLDIGCGWGGLAHYMASNYDVSVVGVTISAEQQKMAQERCEGLDVTIL 
LQDYRDLNDQFDRIVSVGMFEHVGPKNYDTYFAVVDRNLKPEGIFLLHTIGSKKTDLN 
VDPWINKYIFPNGCLPSVRQIAQSSEPHFVMEDWHNFGADYDTTLMAWYERFLAAWPE 
IADNYSERFKRMFTYYLNACAGAFRARDIQLWQVVFSRGVENGLRVAR" 

gene complement(7317..7958)
/gene="ribE"
/note="b1662"
CDS complement(7317..7958)
/gene="ribE"
/EC_number="2.5.1.9"
/function="enzyme; Biosynthesis of cofactors, carriers:
Riboflavin"
/note="f213; 100 pct identical to RISA_ECOLI SW: P29015;
CG Site No. 11923"
/codon_start=1
/transl_table=11
/product="riboflavin synthase, alpha chain"
/protein_id="AAC74734.1"
/db_xref="GI:1787952"
/translation="MFTGIVQGTAKLVSIDEKPNFRTHVVELPDHMLDGLETGASVAH
NGCCLTVTEINGNHVSFDLMKETLRITNLGDLKVGDWVNVERAAKFSDEIGGHLMSGH 
IMTTAEVAKILTSENNRQIWFKVQDSQLMKYILYKGFIGIDGISLTVGEVTPTRFCVH 
LIPETLERTTLGKKKLGARVNIEIDPQTQAVVDTVERVLAARENAMNQPGTEA" 

promoter complement(7998..8026)
/note="factor Sigma70; predicted +1 start at 1741299"
promoter 8111..8140
/note="factor Sigma70; predicted +1 start at 1741455"
gene 8173..9546
/gene="ydhE"
/note="b1663"
CDS 8173..9546

/gene="ydhE" 

/function="putative transport; Not classified" 






/note="o457; 99 pct identical to fragment YDHE_ECOLI
SW:P37340 (230 aa) but contains 227 additional C-ter
residues"
/codon_start=1
/transl_table=11
/product="putative transport protein"
/protein_id="AAC74735.1"
/db_xref="GI:1787953"
/translation="MQKYISEARLLLALAIPVILAQIAQTAMGFVSTVMAGGYSATDM
AAVAIGTSIWLPAILFGHGLLLALTPVIAQLNGSGRRERIAHQVRQGFWLAGFVSVLI 
MLVLWNAGYIIRSMENIDPALADKAVGYLRALLWGAPGYLFFQVARNQCEGLAKTKPG 
MVMGFIGLLVNIPVNYIFIYGHFGMPELGGVGCGVATAAVYWVMFLAMVSYIKRARSM 
RDIRNEKGTAKPDPAVMKRLIQLGLPIALALFFEVTLFAVVALLVSPLGIVDVAGHQI 
ALNFSSLMFVLPMSLAAAVTIRVGYRLGQGSTLDAQTAARTGLMVGVCMATLTAIFTV 
SLREQIALLYNDNPEVVTLAAHLMLLAAVYQISDSIQVIGSGILRGYKDTRSIFYITF 
TAYWVLGLPSGYILALTDLVVEPMGPAGFWIGFIIGLTSAAIMMMLRMRFLQRLPSAI 
ILQRASR" 

gene complement(9587..10843)
/gene="b1664"
CDS complement(9587..10843)
/gene="b1664"
/function="putative enzyme; Not classified"
/note="f418; This 418 aa ORF is 31 pct identical (50 gaps)
to 391 residues of an approx. 864 aa protein YEJO_ECOLI
SW: P33924"
/codon_start=1
/transl_table=11
/product="possible enzyme"
/protein_id="AAC74736.1"
/db_xref="GI:1787954"
/translation="MGSDAKNLMSDGNVQIVKTGEVIGATQLTEGELIVEAGGRAENT
VVTGAGWLKVATGGIAKCTQYGNNGTLSVSDGAIATDIVQSEGGAISLSTLATVNGRH
PEGEFSVDKGYACGLLLENGGNLRVLEGHRAEKIILDQEGGLLVNGTTSAVVVDEGGE
LLVYPGGEASNCEINQGGVFMLAGKASDTLLAGGTMNNLGGEDSDTIVENGSIYRLGT
DGLQLYSSGKTQNLSVNVGGRAEVHAGTLENAVIQGGTVILLSPTSADENFVVEEDRA
PVELTGSVALLDGASMIIGYGAELQQSTITVQQGGVLILDGSTVKGDSVTFIVGNINL
NGGKLWLITDAATHVQLKVKRLRGEGAICLQTSAKEISPDFINVKGEVTGDIHVEITD
ASRQTLCNALKLQPDEDGIGATLQPA" 
promoter complement(10887..10914)
/note="factor Sigma70; predicted +1 start at 1744188"
promoter complement(10926..10952)

/note="factor Sigma32; predicted +1 start at 1744227" 

BASE COUNT 2627 a 2894 c 2660 g 2852 t 

ORIGIN 

1 tcatcttttg tctcaccttt taatttgcta ccctatccat acgcacaata aggctattgt
61 acgtatgcaa attaataata aaggagagta gcaatgtcat tcgaattacc tgcactacca
121 tatgctaaag atgctctggc accgcacatt tctgcggaaa ccatcgagta tcactacggc
181 aagcaccatc agacttatgt cactaacctg aacaacctga ttaaaggtac cgcgtttgaa
241 ggtaaatcac tggaagagat tattcgcagc tctgaaggtg gcgtattcaa caacgcagct
301 caggtctgga accatacttt ctactggaac tgcctggcac cgaacgccgg tggcgaaccg
361 actggaaaag tcgctgaagc tatcgccgca tcttttggca gctttgccga tttcaaagcg
421 cagtttactg atgcagcgat caaaaacttt ggttctggct ggacctggct ggtgaaaaac
481 agcgatggca aactggctat cgtttcaacc tctaacgcgg gtactccgct gaccaccgat
541 gcgactccgc tgctgaccgt tgatgtctgg gaacacgctt attacatcga ctatcgcaat
601 gcacgtcctg gctatctgga gcacttctgg gcgctggtga actgggaatt cgtagcgaaa
661 aatctcgctg cataataact gatggcaaat gcagcattgc ctgaagcgct acgcttatca
721 ggcctacgcg gatcatcgat gtaggtcgga taaggcactc gccgcatccg gcaagataaa
781 tcgcacgttg tcagcaactg taacgcagaa ggttatcctt ctgcgttttt gtttaattag
841 ctgttagcaa cgcaaactgt ttcaggttgt tttctggctg acataaacac cagcaataat
901 gccagtcccg cgacaatcgc tcccatcacc ggcacaaagc tgtatcccag cccagcggaa
961 attaccgcac caccagcagc tgctcccagc gcatttccaa gattaaaggc accaatattg
1021 actgatgaag acagacctgg cgcttcactg gcgacacgca tcacgcgcat ctgtaacggc
1081 ggtacgaccg caaaggttgc tgcgccccac accaccatgc taatagctgc gccgaactca
1141 ttgcgggcca ggaacgggat tgccagcata atcaccatca acaacaacaa aaagcctttc
1201 aacgtgccgt taactgaacg atctgccagt ttgccgccga gatagttacc gatagagaat
1261 ccgacaccaa tcagcaccag cattgccgtg acgaacaccg gtgttgcgtg ggtaatactt
1321 tgcagtaccg gagagatata ggtgtagaga gtaaacattg caccagctcc cagtaccgtc
1381 gtcagcaatg cagacagcac ctgcggacgc attaataccg ccagctcttt tttcacttca
1441 ggtcgtgccc ctgcaccacc tttaggtaat gagaagaaca gacttaccat tgaaatcact
1501 cccagccccg ccgttgccag aaatgacatc cgccagccga tggtttcacc caaccaggtc
1561 gccgccggca cgccaccgat atttgccagg gttaacccca taaacatagt ggcaactgcg
1621 ctggcctgtt tatgttttgg caccacgctt gcggccacga ctgaacccaa accaaaaaat
1681 gctccgtgat tcaggctggt caaaatgcgt gaaagcatca gggtcatata atccggcgcg
1741 atggcggaaa gtacgttgcc gagcgtgaaa attgccatca ggaaaatcaa cgcactgcgg
1801 cgggcacgat gagaaagtag aagcgtcatc agcggcgcgc caaccattac gccaactgca
1861 taggcactga ttaacattcc ggcagcggga atcgagacat ccacaccgcg cgcaatgacg
1921 ggcaacaagc ccattggcga gaactccgtt gtcccgatac caaacgcgcc aatcgccagc
1981 gccagcaacg gatagttaat tttcatgcct tatctccacc tcttcgcgtc attacgcgat
2041 attcattaaa gtggcgaaag catgacagca atcacaaaaa aatgaaaata acaaaaagag
2101 aaaacacttt tgccattttg ctaacaaaca ggaaggagat gcgagggaga acgcgctccc
2161 tcgagaggaa atcagtgcag cgcggcagtc aaacccacgg ctacgatcaa accgaggacg
2221 ataatcgttg ttaccagtga aaatttaagg tcggtgctca tcaagttttc tcctttttta
2281 ttaccacaca aaaagtgata ttacgcattt ttacacactg tgatgaaaaa atctcccgtc
2341 atttataatg ataagtgttt ttaccacttc cccttttcgt caagatcggc caaaattcca
2401 cgcttacact atttgcgtac tggccattga ccccttcctg acgctccgtg tcgtttttcc
2461 ggcgtaccgc aacacttttg ttgtgcgtaa ggtgtgtaaa ggcaaacgtt taccttgcga
2521 ttttgcagga gctgaagtta gggtctggag tgaaatggaa tggcaacaat aaaagatgta
2581 gcgaaacgag caaacgtttc cactacaact gtgtcacacg tgatcaacaa aacacgtttc
2641 gtcgctgaag aaacgcgcaa cgccgtgtgg gcagcgatta aagaattaca ctactcccct
2701 agcgcggtgg cgcgtagcct gaaggttaac cacaccaagt ctatcggttt gctggcgacc
2761 agcagcgaag cggcctattt tgccgagatc attgaagcag ttgaaaaaaa ttgcttccag
2821 aaaggttaca ccctgattct gggcaatgcg tggaacaatc ttgagaaaca gcgggcttat
2881 ctgtcgatga tggcgcaaaa acgcgtcgat ggtctgctgg tgatgtgttc tgagtaccca
2941 gagccgttgc tggcgatgct ggaagagtat cgccatatcc caatggtggt catggactgg
3001 ggtgaagcaa aagctgactt caccgatgcg gtcattgata acgcgttcga aggcggctac
3061 atggccgggc gttatctgat tgaacgcggt caccgcgaaa tcggcgtcat ccccggcccg
3121 ctggaacgta acaccggcgc aggccgcctt gccggtttta tgaaggcgat ggaagaagcg
3181 atgatcaagg tgccggaaag ctggattgtg cagggtgact ttgaacctga atccggttat
3241 cgcgccatgc agcaaatcct gtcgcagccg catcgcccta ctgccgtctt ctgtggtggc
3301 gatatcatgg caatgggcgc actttgtgct gctgatgaaa tgggcctgcg cgtcccgcag
3361 gatgtttcgc tgatcggtta tgataacgtg cgcaacgcgc gctattttac gccggcgctg
3421 accacgatcc atcagccaaa agattcgctg ggtgaaacag cgttcaacat gctgttggat
3481 cgtatcgtca acaaacgtga agaaccgcag tctattgaag tgcatccgcg cttgattgaa
3541 cgccgctccg tggctgacgg cccgttccgc gactatcgtc gttaatcacc cgttgcggga
3601 gtctcttccg gctcccgcag ccactcctta ttcagcgtct cactatcgcc gagatactca
3661 agcaaccagg ttaacgcagg cgacatatca ttttgctgcc atgtcagaca acatgccgaa
3721 tccggaaagg ggttttccag ttctaatgct acccacttcc cctcattaag ccacggtttg
3781 gcgaaatgtg ttggcaccat ccctatgcat aatcctgccg agatacaggt tgccgatgat
3841 tcccagtcag gcacgacaac tcttttttgg ttatccagca accaggtaat acgtttaggt
3901 agcgttcgcg aggtgtcttc gcgcaccaac gacggccagt tgcgcaacgt atcatcgctg
3961 aacgggccat ccatcaacgc cagcgggtgg tgactggcaa caacgcaact ccagcttagc
4021 atccccatat cccggaaggc ataacgaccg cctaccggaa tcgcgcgtgt tgcgccaatc
4081 gccagttcca cgcgcccgtc ggaaagcgca tcccagacac cgttgaacac ttcctgaaag
4141 acaagaagtt cgacatcatc aaaatggcga taaaaatcaa cgatcatctg ccgtgtacgt
4201 tctggcctga caatattatc cactgcgata gctaactgac cgcgccagcc gttcgctatc
4261 tgctgacatt gctggcgggt gatctgcatt tttttgacaa cagagcgccc ttctttgaga
4321 aaccacgctc cagcagcggt cagctccaca tcacggtgcc gtcgttcaaa gagcggcacc
4381 gccagccact cttccagctg acgcacggta tagctgaccg cagaaggaac gcgatgcagc
4441 tcctgtgccg cagcgctaaa actaccatta cgcgctaccg catcaacaac ttcgagtgaa
4501 tattctgacc acatagtctg cctgcaaaat ttttgaaacc agtcatcaaa tattaccgtt
4561 tcacaacact aatttcactc cctacacttt gcggcggtgt ttaattgaga gatttagaga
4621 atatacatgc aacctgggaa aagattttta gtctggctgg cgggtttgag cgtactcggt
4681 tttctggcaa ccgatatgta tctgcctgct ttcgccgcca tacaggccga cctgcaaacg
4741 cctgcgtctg ctgtcagtgc cagccttagt ctgttccttg ccggttttgc cgcagcccag
4801 cttctgtggg ggccgctctc cgaccgttat ggtcgtaaac cggtattatt aatcggcctg
4861 acaatttttg cgttaggtag tctggggatg ctgtgggtag aaaacgccgc tacgctgctg
4921 gtattgcgtt ttgtacaggc tgtgggtgtc tgcgccgcgg cggttatctg gcaagcatta
4981 gtgacagatt attatccttc acagaaagtt aaccgtattt ttgcggccat catgccgctg
5041 gtgggtctat ctccggcact ggctcctctg ttaggaagct ggctgctggt ccatttttcc
5101 tggcaggcga ttttcgccac cctgtttgcc attaccgtgg tgctgattct gcctattttc
5161 tggctcaaac ccacgacgaa ggcccgtaac aatagtcagg atggtctgac ctttaccgac
5221 ctgctacgtt ctaaaaccta tcgcggcaac gtgctgatat acgcagcctg ttcagccagt
5281 ttttttgcat ggctgaccgg ttcaccgttc atccttagtg aaatgggcta cagcccggca
5341 gttattggtt taagttatgt cccgcaaact atcgcgtttc tgattggtgg ttatggctgt
5401 cgcgccgcgc tgcagaaatg gcaaggcaag cagttattac cgtggttgct ggtgctgttt
5461 gctgtcagcg tcattgcgac ctgggctgcg ggcttcatta gccatgtgtc gctggtcgaa
5521 atcctgatcc cattctgtgt gatggcgatt gccaatggcg cgatctaccc tattgttgtc
5581 gcccaggcgc tgcgtccctt cccacacgca actggtcgcg ccgcagcgtt gcagaacact
5641 cttcaactgg gtctgtgctt cctcgcaagt ctggtagttt cctggctgat cagtatcagc
5701 acgccattgc tcaccaccac cagcgtgatg ttatcaacag taatgctggt cgcgctgggt
5761 tacatgatgc aacgttgtga agaagttggc tgccagaatc atggcaatgc cgaagtcgct
5821 catagcgaat cacactgatc tatatcgata tacttatact taggctgcta acaaaatttt
5881 gttgtatgat tgaaattagc ggcctatact aatttcgagt tgttaaagct acgataaata
5941 ttatgttttt acggggacag gatcgttccc gactcactat ggatagtcat ttcggcaagg
6001 gttcctcctt tccctctgtt ctacgtcgga ttatagactc gcggtttttt ctgcgagatt
6061 tctcacaaag cccaaaaagc gtctacgctg ttttaaggtt ctgatcaccg accagtgatg
6121 gagaaactat gagttcatcg tgtatagaag aagtcagtgt accggatgac aactggtacc
6181 gtatcgccaa cgaattactt agccgtgccg gtatagccat taacggttct gccccggcgg
6241 atattcgtgt gaaaaacccc gattttttta aacgcgttct gcaagaaggc tctttggggt
6301 taggcgaaag ttatatggat ggctggtggg aatgtgaccg actggatatg ttttttagca
6361 aagtcttacg cgcaggtctc gagaaccaac tcccccatca tttcaaagac acgctgcgta
6421 ttgccggcgc tcgtctcttc aatctgcaga gtaaaaaacg tgcctggata gtcggcaaag
6481 agcattacga tttgggtaat gacttgttca gccgcatgct tgatcccttc atgcaatatt
6541 cctgcgctta ctggaaagat gccgataatc tggaatctgc ccagcaggcg aagctcaaaa
6601 tgatttgtga aaaattgcag ttaaaaccag ggatgcgcgt actggatatt ggctgcggct
6661 ggggcggact ggcacactac atggcatcta attatgacgt aagcgtggtg ggcgtcacca
6721 tttctgccga acagcaaaaa atggctcagg aacgctgtga aggcctggat gtcaccattt
6781 tgctgcaaga ttatcgtgac ctgaacgacc agtttgatcg tattgtttct gtggggatgt
6841 tcgagcacgt cggaccgaaa aattacgata cctattttgc ggtggtggat cgtaatttga
6901 aaccggaagg catattcctg ctccatacta tcggttcgaa aaaaaccgat ctgaatgttg
6961 atccctggat taataaatat atttttccga acggttgcct gccctctgta cgccagattg
7021 ctcagtccag cgaaccccac tttgtgatgg aagactggca taacttcggt gctgattacg
7081 atactacgtt gatggcgtgg tatgaacgat tcctcgccgc atggccagaa attgcggata
7141 actatagtga acgctttaaa cgaatgttta cctattatct gaatgcctgt gcaggtgctt
7201 tccgcgcccg tgatattcag ctctggcagg tcgtgttctc acgcggtgtt gaaaacggcc
7261 ttcgagtggc tcgctaaagg ctattctatc gccccctctc cgggggcgat ttcagatcag
7321 gcttctgtgc ctggttgatt catggcattt tctcgtgccg ccagcacacg ttctaccgta
7381 tctaccactg cctgagtttg tggatcgatt tcaatgttga cgcgtgcgcc aagttttttc
7441 ttcccaagag tcgtgcgttc cagtgtttcc ggaattaaat ggacgcaaaa acgcgttggc
7501 gtgacttcgc cgacggtcag gctaataccg tcgatgccaa taaatccttt gtacagaata
7561 tatttcatca actgactatc ctggacttta aaccagatct ggcgattatt ttctgaggtt
7621 aatattttcg ccacttcagc agtggtcata atatgacctg acattaagtg tccgccaatt
7681 tcatcactga atttcgccgc acgctcaacg tttacccaat cccccacttt taaatcgcca
7741 agattggtaa tgcgtaacgt ttctttcatc aggtcaaaac tgacatggtt gccgttaatt
7801 tccgtcacgg tcaggcagca accgttatgc gccacggaag caccggtttc caggccgtcc
7861 agcatgtggt cgggtaactc caccacatgc gtacgaaaat ttggtttctc gtcaatcgac
7921 accagttttg cggtgccctg tacaatcccc gtaaacatac ttacaactcc tgaaatcagt
7981 taagacattc tgttcagcac aatagcaggt ggaaaacgcc cttaccagtg aaggggtaag
8041 aatggctatt ttttcactgg agaattaata aatcctcgct acaatagact gaatttcccc
8101 tgcttcttct ttttgctgcc cattcaggcg gctttttagt ctctcatata actacaaata
8161 aaaggtgttc acgtgcagaa gtatatcagt gaagcgcgtc tgttattagc attagcaatc
8221 ccggtgattc tcgcgcaaat cgcccaaact gcgatgggtt ttgtcagtac cgtgatggcg
8281 ggcggctata gtgccaccga catggcggcg gtcgctatcg gtacttctat ctggcttccg
8341 gcgatcctct ttggtcacgg actgctgctg gcattaacgc cggttatcgc gcaattaaat
8401 ggttccggtc gacgtgagcg cattgcgcat caggtgcgac aaggtttctg gctggcaggt
8461 tttgtttccg ttctcattat gctggtgctg tggaatgcag gttacattat ccgctccatg
8521 gaaaacatcg atccggctct ggcggacaaa gccgtgggtt atctgcgtgc gttgttgtgg
8581 ggcgcgccgg gatatctgtt cttccaggtt gcccgtaacc agtgtgaagg tctggcaaaa
8641 accaagccgg gtatggtaat gggctttatc ggcctgctgg tgaacatccc ggtgaactat
8701 atctttattt atggtcattt cggtatgcct gagctcggtg gcgttggttg tggcgtggct
8761 actgcggcgg tgtattgggt catgttcctt gccatggttt cttacattaa acgcgcccgc
8821 tccatgcgcg atattcgtaa cgaaaaaggc accgcaaaac ccgatcctgc ggttatgaaa
8881 cgactgattc aactcggttt gccgattgcg ctggcactgt tctttgaagt gacactgttt
8941 gccgtcgtgg ctctgttagt gtctccgctc ggtattgttg atgtcgcagg acaccagatt
9001 gccctgaact ttagttcact aatgttcgtg cttccaatgt cgctggcggc agcggtaact
9061 atccgcgtag gttatcgtct gggtcagggc tcaacgctgg atgcgcaaac cgctgcgcgg
9121 accgggctta tggtgggtgt ctgtatggca accctgacgg ccattttcac ggtttcactg
9181 cgggagcaaa tcgccctgtt gtacaacgac aatcccgagg ttgtaacgct ggctgcgcat
9241 ttgatgttgc tggcggcggt atatcagatt tctgactcaa tccaggtgat tggcagtggg
9301 attttgcgtg gttataaaga tacgcgttcc attttctata ttacctttac ggcttactgg
9361 gtgctgggct tgccaagcgg ctatattctg gcactgaccg atctggtcgt tgaacctatg
9421 gggccagcag gcttctggat aggctttatt attggcctga cgtcggcagc cattatgatg
9481 atgttgcgta tgcggttcct gcaacgtctg ccgtcagcca tcattctgca acgagcatcc
9541 cgctaataaa gacaaggcgc aaccttcacg ggttgcgcct gtatttttac gcaggctgga
9601 gcgttgcgcc aatcccgtct tcgtctggct gtaatttcag agcgttacac agagtttgcc
9661 gactggcatc tgttatctca acgtgtatat ccccggtaac ttcccctttc acattgatga
9721 agtcaggtga aatttctttt gcactggttt gcaggcaaat cgctccctct ccgcgcaggc
9781 gtttcacttt cagttgcaca tgcgttgccg catcagtgat cagccacagt tttccaccat
9841 tcagattgat gttaccaaca atgaaagtga cactgtcacc ttttaccgta ctgccgtcta
9901 gaatcaacac accgccctgc tgtacagtaa tcgttgattg ttgcagctct gcgccataac
9961 caataatcat tgaagcgccg tccagtaatg caacactccc ggtcagttca accggtgcgc
10021 gatcttcctc tacgacaaaa ttttcgtccg cgctggtggg tgacaacagg atcactgtcc
10081 caccttgtat taccgcgttt tccagcgtac cggcatgcac ttcagcccga ccacccacat
10141 tcacggacag gttttgcgtc ttaccggaac tgtagagctg aaggccatcc gtccccagac
10201 gataaatgga tccattctca acaatagtgt cagagtcttc accaccgaga ttattcatgg
10261 tgccaccagc aagcaacgta tcactggctt tcccggccag cataaaaacg ccgccctgat
10321 taatctcaca attgctggct tccccacctg gatacaccaa caattcacca ccttcatcta
10381 ccacgaccgc tgaggttgtc ccattaacca acaggccgcc ctcttgatcg agaatgattt
10441 tttccgcgcg atgtccttcc agtacacgca ggttaccgcc attttccagc aacaaaccgc
10501 aggcataacc tttatcaacg ctgaattcac cttcgggatg gcggccatta accgtagcga
10561 gcgtagagag actaattgcg cctccctcgg actgaacaat atctgtggca atggcaccat
10621 cgctgaccga tagcgtgcca ttgttaccgt actgtgtgca tttggcgatc ccaccggttg
10681 ccactttcaa ccagccagcc cccgtgacca cggtattttc ggctcttccg ccagcttcaa
10741 caatcaactc gccttcagta agttgcgtcg cgccaatgac ctcgccggtc ttaacaattt
10801 gcacgttccc gtcgctcatc aagtttttcg catcagatcc catgatttat tcctttgctg
10861 catctgtgtg cctttattgc tacctaagtg taaaggctac ggaggattta tccacgacag
10921 atttgagatg gtggcaaaca actctgttta aactctgata cacgaattat tgggttgtat
10981 cagatgtaaa tgcgatcctg aataaaaatc acccttgcaa atcaacaaaa tat
//